Efficient fair principal component analysis
Authors
Abstract
It has been shown that dimension-reduction methods such as Principal Component Analysis (PCA) may be inherently prone to unfairness and treat data from different sensitive groups (e.g., race, color, sex) unfairly. In pursuit of fairness-enhancing dimensionality reduction, using the notion of Pareto optimality, we propose an adaptive first-order algorithm to learn a subspace that preserves fairness while slightly compromising the reconstruction loss. Theoretically, we provide sufficient conditions under which the solution of the proposed algorithm belongs to the Pareto frontier for all sensitive groups; thereby, the optimal trade-off between overall reconstruction loss and fairness constraints is guaranteed. We also provide a convergence analysis of our algorithm and show its efficacy through empirical studies on different datasets, which demonstrates superior performance in comparison with state-of-the-art algorithms. The proposed fairness-aware PCA algorithm can be efficiently generalized to multiple group sensitive features and can effectively mitigate unfairness in decisions made by downstream tasks such as classification.
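As a rough illustration of the trade-off described in the abstract (the paper's exact objective, step sizes, and update rules are not reproduced here; all function names and the penalty form below are illustrative assumptions), the following sketch runs projected gradient descent over an orthonormal basis V, penalizing the gap between the reconstruction losses of the worst- and best-off groups on top of the overall loss:

```python
import numpy as np

def group_losses(X_groups, V):
    """Average squared reconstruction error of each group on the subspace spanned by V."""
    return np.array([np.mean(np.sum((X - X @ V @ V.T) ** 2, axis=1)) for X in X_groups])

def fair_pca_sketch(X_groups, d, lam=1.0, lr=1e-3, iters=500, seed=0):
    """Projected gradient descent on an orthonormal basis V (n_features x d).

    Illustrative objective: overall reconstruction loss + lam * (worst group
    loss - best group loss). This is a sketch, not the paper's exact method.
    """
    X_all = np.vstack(X_groups)
    n_features = X_all.shape[1]
    rng = np.random.default_rng(seed)
    V, _ = np.linalg.qr(rng.standard_normal((n_features, d)))
    C_all = X_all.T @ X_all / len(X_all)
    C_groups = [X.T @ X / len(X) for X in X_groups]

    for _ in range(iters):
        losses = group_losses(X_groups, V)
        worst, best = int(np.argmax(losses)), int(np.argmin(losses))
        # gradient of (overall loss + lam * loss gap) w.r.t. V, up to a constant factor
        grad = -2.0 * (C_all + lam * (C_groups[worst] - C_groups[best])) @ V
        V, _ = np.linalg.qr(V - lr * grad)  # retract back onto orthonormal bases
    return V

# usage (X_group_a, X_group_b are hypothetical per-group data matrices):
# V = fair_pca_sketch([X_group_a, X_group_b], d=2); Z = X @ V
```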
Similar resources
Convex Formulations for Fair Principal Component Analysis
Though there is a growing body of literature on fairness for supervised learning, the problem of incorporating fairness into unsupervised learning has been less well-studied. This paper studies fairness in the context of principal component analysis (PCA). We first present a definition of fairness for dimensionality reduction, and our definition can be interpreted as saying that a reduction is ...
Efficient Intrusion Detection Using Principal Component Analysis
Most current intrusion detection systems are signature-based or machine-learning-based methods. Despite the number of machine learning algorithms applied to the KDD 99 cup, none of them has introduced a pre-model to reduce the huge quantity of information present in the different KDD 99 datasets. We introduce a method that applies to the different datasets before performing any of the different ...
Principal Component Projection Without Principal Component Analysis
We show how to efficiently project a vector onto the top principal components of a matrix, without explicitly computing these components. Specifically, we introduce an iterative algorithm that provably computes the projection using few calls to any black-box routine for ridge regression. By avoiding explicit principal component analysis (PCA), our algorithm is the first with no runtime dependen...
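The core observation can be sketched in a few lines (a simplified illustration only; the cited paper additionally applies a polynomial of this operator to sharpen the approximation, which is omitted here, and the function names below are assumptions): one ridge-regression solve with parameter lam applies (AᵀA + lam·I)⁻¹AᵀA to a vector, which nearly preserves components along eigenvectors of AᵀA with eigenvalue well above lam and nearly annihilates the rest.

```python
import numpy as np

def ridge_solve(A, b, lam):
    """Black-box ridge regression: argmin_x ||A x - b||^2 + lam * ||x||^2."""
    n_features = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n_features), A.T @ b)

def soft_pc_projection(A, x, lam):
    """Smooth surrogate for projecting x onto the principal components of A whose
    eigenvalues (of A^T A) exceed lam: applies (A^T A + lam I)^{-1} (A^T A) to x."""
    return ridge_solve(A, A @ x, lam)
```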
Compression of Breast Cancer Images By Principal Component Analysis
The principle of dimensionality reduction with PCA is the representation of the dataset ‘X’ in terms of eigenvectors e_i ∈ R^N of its covariance matrix. The eigenvectors oriented in the direction with the maximum variance of X in R^N carry the most relevant information of X. These eigenvectors are called principal components [8]. Ass...
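A minimal sketch of that eigendecomposition-based compression in Python (generic PCA; nothing below is specific to the breast-cancer images or the parameter choices of the cited work):

```python
import numpy as np

def pca_compress(X, k):
    """Represent the rows of X with the k eigenvectors of the covariance
    matrix that have the largest eigenvalues (the principal components)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (len(X) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    components = eigvecs[:, ::-1][:, :k]          # top-k principal components
    codes = Xc @ components                       # compressed representation
    reconstruction = codes @ components.T + mean  # lossy reconstruction
    return codes, reconstruction
```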
Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of m stores a sample of n points sampled i.i.d. from a single unknown distribution. We study algorithms for estimating the leading principal component of the population covariance matrix that are both communication-efficient and achieve estimation error of the order of...
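For intuition only, a communication-light baseline for this setting can be sketched as distributed power iteration, in which each of the m machines ships a single d-dimensional vector per round (an illustrative assumption; this is not the estimator analyzed in the cited paper):

```python
import numpy as np

def distributed_power_iteration(local_samples, rounds=50, seed=0):
    """Estimate the leading principal component when the data is split across
    machines; each machine sends only one d-dimensional vector per round."""
    d = local_samples[0].shape[1]
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(d)
    v /= np.linalg.norm(v)
    n_total = sum(len(X) for X in local_samples)
    for _ in range(rounds):
        # each machine computes X^T (X v) locally and ships the resulting d-vector
        messages = [X.T @ (X @ v) for X in local_samples]
        v = sum(messages) / n_total  # coordinator forms (1/n) * sum_i X_i^T X_i v
        v /= np.linalg.norm(v)
    return v
```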
Journal
Journal title: Machine Learning
Year: 2022
ISSN: ['0885-6125', '1573-0565']
DOI: https://doi.org/10.1007/s10994-021-06100-9